Mean Field Markov Decision Processes

Authors

Abstract

We consider mean-field control problems in discrete time with discounted reward, infinite horizon and compact state and action space. The existence of optimal policies is shown, and the limiting mean-field problem is derived when the number of individuals tends to infinity. Moreover, for the average reward problem we show that the policy obtained from this limit is $$\varepsilon$$-optimal for the original problem if the number of individuals is large and the discount factor is close to one. This result is very helpful, because it turns out that in the special case where the reward depends only on the distribution of the individuals, we obtain an interesting subclass of problems in which an optimal policy can be obtained by first computing an optimal measure from a static optimization problem and then achieving it with Markov Chain Monte Carlo methods. We give two applications, avoiding congestion on a graph and optimal positioning on a market place, which we solve explicitly.
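The two-step construction mentioned at the end of the abstract (first compute an optimal measure by static optimization, then realize it with Markov Chain Monte Carlo) can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than the paper's algorithm: a small finite state space, a congestion-type reward that depends only on the state distribution, and a Metropolis-Hastings random walk used to drive individuals towards the optimal measure.

```python
import numpy as np

# Illustrative sketch only: finite state space and a reward that depends
# solely on the distribution of individuals (both assumed for this example).
states = np.arange(5)
rng = np.random.default_rng(0)

def reward(mu):
    # Assumed congestion-type reward: penalise concentrated distributions.
    return -np.sum(mu ** 2)

# Step 1: static optimization over probability measures on the state space.
# A crude random search stands in here for a proper (convex) solver.
best_mu, best_val = None, -np.inf
for _ in range(20_000):
    mu = rng.dirichlet(np.ones(len(states)))
    if reward(mu) > best_val:
        best_mu, best_val = mu, reward(mu)

# Step 2: Metropolis-Hastings chain whose stationary distribution is best_mu;
# simulating individuals with this kernel realises the optimal measure.
def mh_step(x, mu):
    y = (x + rng.choice([-1, 1])) % len(states)          # symmetric proposal
    return y if rng.random() < min(1.0, mu[y] / max(mu[x], 1e-12)) else x

x, counts = 0, np.zeros(len(states))
for _ in range(200_000):
    x = mh_step(x, best_mu)
    counts[x] += 1

print("optimal measure   :", np.round(best_mu, 3))
print("empirical measure :", np.round(counts / counts.sum(), 3))
```

With the assumed reward $$-\sum_x \mu(x)^2$$ the optimal measure is roughly uniform, and the empirical occupation of the simulated chain approaches it; the congestion and market-place applications in the paper itself are solved explicitly rather than by simulation.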


Similar resources

Mean-Variance Optimization in Markov Decision Processes

We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...
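For orientation, the variance-constrained problem referred to in this abstract is commonly written as follows, with $$R=\sum_{t=0}^{T-1} r(X_t,A_t)$$ the cumulative reward over the finite horizon and $$c$$ a given bound (the notation is assumed here, not quoted from that paper):

$$ \max_{\pi}\; \mathbb{E}^{\pi}[R] \quad \text{subject to} \quad \operatorname{Var}^{\pi}(R) \le c. $$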


Mean Field Approximation of the Policy Iteration Algorithm for Graph-Based Markov Decision Processes

In this article, we consider a compact representation of multidimensional Markov Decision Processes based on Graphs (GMDP). The states and actions of a GMDP are multidimensional and attached to the vertices of a graph, allowing the representation of local dynamics and rewards. This approach is in line with approaches based on Dynamic Bayesian Networks. For policy optimisation, a direct applica...
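As a rough illustration of the representation described above, a GMDP-style model can be stored as local components attached to the vertices of a graph; the class and field names below are assumptions made for this sketch, not the article's notation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State, Action = int, int

@dataclass
class LocalModel:
    # Local action set of a vertex.
    actions: List[Action]
    # Local transition: (own state, neighbour states, own action) -> distribution
    # over the vertex's next state, so dynamics depend only on the neighbourhood.
    transition: Callable[[State, Tuple[State, ...], Action], Dict[State, float]]
    # Local reward with the same locality structure.
    reward: Callable[[State, Tuple[State, ...], Action], float]

@dataclass
class GMDP:
    neighbours: Dict[int, List[int]]   # graph adjacency, one entry per vertex
    local: Dict[int, LocalModel]       # one local model per vertex

    def total_reward(self, s: Dict[int, State], a: Dict[int, Action]) -> float:
        # The global reward factorises as a sum of local rewards.
        return sum(
            m.reward(s[v], tuple(s[u] for u in self.neighbours[v]), a[v])
            for v, m in self.local.items()
        )
```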


Energy and Mean-Payoff Parity Markov Decision Processes

We consider Markov Decision Processes (MDPs) with mean-payoff parity and energy parity objectives. In system design, the parity objective is used to encode ω-regular specifications, and the mean-payoff and energy objectives can be used to model quantitative resource constraints. The energy condition requires that the resource level never drops below 0, and the mean-payoff condition requires tha...
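Written out with notation assumed here, for a run with weights $$w(e_1), w(e_2),\ldots$$ and initial credit $$c_0$$, the energy condition and a typical threshold form of the mean-payoff condition read:

$$ c_0 + \sum_{i=1}^{n} w(e_i) \ge 0 \ \text{ for all } n \ge 1, \qquad \liminf_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} w(e_i) \ge \nu. $$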


Algorithmic aspects of mean-variance optimization in Markov decision processes

We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...


Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes

In this note, we compare two approaches for handling risk-variability features arising in discrete-time Markov decision processes: models with exponential utility functions and mean-variance optimality models. Computational approaches for finding optimal decisions with respect to the optimality criteria mentioned above are presented, and analytical results showing connections between the above op...
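In the usual notation (assumed here, not quoted from that note), the two criteria being compared are the exponential-utility (risk-sensitive) objective and a mean-variance trade-off for the cumulative reward $$R$$:

$$ \max_{\pi}\; -\tfrac{1}{\gamma}\log \mathbb{E}^{\pi}\!\left[e^{-\gamma R}\right] \qquad \text{and} \qquad \max_{\pi}\; \mathbb{E}^{\pi}[R] - \lambda \operatorname{Var}^{\pi}(R). $$

A second-order expansion of the first objective in the risk parameter, $$\mathbb{E}^{\pi}[R]-\tfrac{\gamma}{2}\operatorname{Var}^{\pi}(R)+O(\gamma^{2})$$, is the standard source of the connection between the two criteria.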



Journal

Journal title: Applied Mathematics and Optimization

Year: 2023

ISSN: 0095-4616, 1432-0606

DOI: https://doi.org/10.1007/s00245-023-09985-1